This final progress report summarizes the main recent results; full reports of the
results are contained in the papers appended herewith. The summary also
reviews some results from previous AFOSR grants where these are necessary
to provide the background for the current research. Four areas are
summarized:

1.  Basic  Mechanisms  of  Visual  Motion  and  Texture  Perception 

2.  Lateral  Interactions  in  Texture  Stimuli

3.  Information  Processing 

4.  Visual  Attention  and  Short-Term  Memory. 


1. Basic Mechanisms of Visual Motion and Texture Perception

This project concerned the discovery and description of basic mechanisms of human visual
motion and texture perception. Motion and texture are critical inputs to visual perception. Basic
mechanisms of motion are of particular interest because they are perhaps the primary substrate for
perceptual recovery of 3D depth structure and orientation in space, they are critical for detecting
new objects and events in the environment, and they play an important role in 2D perception.

Motion  and  texture  are  considered  together  here  because  the  problem  of  discriminating 
velocity  in  a  one-dimensional  motion  stimulus  is  formally  equivalent  to  the  problem  of 
discriminating  orientation  in  a  texture  stimulus:  the  t  dimension  of  the  motion  stimulus  becomes 
the  y  dimension  of  the  texture  stimulus. 
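
The equivalence can be illustrated numerically. The following is a minimal sketch, not taken from the report: it builds a one-dimensional pattern translating at an assumed velocity `v`, then treats the resulting space-time array exactly like a static image and recovers `v` as the slope of the pattern. The function name `space_time_slope` and all parameter values are illustrative assumptions.

```python
import numpy as np

def space_time_slope(image):
    """Estimate the slope of a pattern in an array indexed [t, x].

    For a pattern f(x - v*t), the derivative along the first axis equals
    -v times the derivative along x, so a least-squares fit of one
    gradient on the other recovers v. Applied to an array indexed [y, x],
    the same computation measures texture orientation.
    """
    d_first, d_x = np.gradient(image)  # derivatives along axis 0 and axis 1
    return -np.sum(d_first * d_x) / np.sum(d_x * d_x)

x = np.arange(256)
t = np.arange(128)
T, X = np.meshgrid(t, x, indexing="ij")
v = 0.5                      # assumed velocity, pixels per frame
k = 2 * np.pi / 64           # assumed spatial frequency, radians per pixel
motion_stimulus = np.sin(k * (X - v * T))
texture_stimulus = motion_stimulus.copy()  # same array, first axis read as y

velocity = space_time_slope(motion_stimulus)
orientation_deg = np.degrees(np.arctan(space_time_slope(texture_stimulus)))
```

The same number is read once as a velocity (pixels per frame) and once as the tangent of an orientation, which is the sense in which the two discrimination problems coincide.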


First-Order  Motion  Perception 


The initial studies, carried out at the inception of AFOSR
support, succeeded in describing the basic mechanism of human Fourier motion perception in full
mathematical detail. Several critical insights made this possible. The most important was
recognizing that the failure of previous theoretical attempts to apply Reichardt (1957) and similar
systems models to human vision (e.g., Foster, 1971) was due in large measure to the fact that they
had dealt with data obtained with high-contrast visual stimuli. The human motion-processing
system behaves in a simple way for stimuli whose contrast is less than about 0.04 to 0.05 (e.g.,
Nakayama & Silverman, 1985, and others). For higher contrasts, early nonlinearities in the visual
system make the analysis of motion processing enormously more complex. Additionally, because
hundreds of thousands of detectors may contribute to human psychophysical responses, formal
models need to explicitly model decision processes. Finally, stimuli needed to be developed that
permitted conclusions about basic motion computations independent of the voting/decision rules
imposed by higher-order processes.

van Santen & Sperling (1984) was perhaps the first successful application of these basic
principles of first-order motion perception to humans, principles that are now quite widely accepted.
van Santen & Sperling (1985) showed the equivalence of two subsequent models (Adelson &
Bergen, Watson & Ahumada) to the van Santen-Sperling version of the Reichardt model and
developed new results.

van  Santen,  J.  P.  H.  and  Sperling,  G.  (1984).  Temporal  covariance  model  of  human  motion 
perception.  Journal  of  the  Optical  Society  of  America  -  A,  1,  451-473. 

The first of two papers by van Santen and Sperling reports that an elaborated version of a Reichardt
model previously proposed for insect vision gives an excellent account
of human psychophysical data for low-contrast stimuli. Applying a Reichardt detector to human
vision requires considering voting rules (e.g., absolute maximum or total power) for detectors,
because many detectors present possibly conflicting information to the decision stage. There is a
full mathematical development of the elaborated theory. Many counter-intuitive predictions were
generated by the theory, and three were experimentally tested. (1) A superimposed stationary
grating, even one of the same spatial frequency as the moving grating, should not adversely
affect motion-direction discrimination. (2) Similarly, a stationary flickering grid should not
affect motion discrimination of a moving stimulus with a different temporal frequency. When the
temporal frequencies of the moving and masking stimuli are the same, then anything may happen,
even an illusion of motion in the opposite direction. This apparent reversal of direction of the
moving grating for certain predictable phase relations of the masking stimulus was demonstrated
experimentally. (3) For certain spatially-sampled displays, the strength of a motion percept is
directly proportional to the product of the contrasts in adjacent regions. All three predictions were
verified. These data show that, contrary to "logical intuition," human motion detection does not
rely on matching spatial features in successive frames, but rather on matching of temporal
sequences in adjacent locations.
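
Predictions (1) and (3) can be checked on a stripped-down correlator. The sketch below is an illustration under stated assumptions, not the elaborated model of the paper: it omits the spatial and temporal input filters and the voting stage, uses a pure delay, and represents a stationary pattern simply as a constant offset at each of two locations. The names `reichardt_output` and `moving` and all numerical values are hypothetical.

```python
import numpy as np

def reichardt_output(left, right, delay):
    """Time-averaged opponent correlator for two adjacent input signals.

    Each half-detector multiplies the delayed signal from one location by
    the undelayed signal from the other; the two products are subtracted.
    Positive output signals motion from `left` toward `right`.
    np.roll implements the delay and assumes periodic signals.
    """
    left_delayed = np.roll(left, delay)
    right_delayed = np.roll(right, delay)
    return np.mean(left_delayed * right - right_delayed * left)

n = 1000
t = np.arange(n)
omega = 2 * np.pi * 5 / n   # 5 temporal cycles in the record
phase = np.pi / 2           # spatial phase step between the two locations
delay = 25                  # samples, so omega * delay = pi / 4

def moving(c_left, c_right, direction=+1):
    """Signals seen at two locations by a drifting sinusoid."""
    left = c_left * np.sin(omega * t)
    right = c_right * np.sin(omega * t - direction * phase)
    return left, right

base = reichardt_output(*moving(0.02, 0.03), delay)
doubled = reichardt_output(*moving(0.04, 0.03), delay)
reversed_ = reichardt_output(*moving(0.02, 0.03, direction=-1), delay)

# Superimpose a stationary pattern: a constant offset at each location.
left, right = moving(0.02, 0.03)
masked = reichardt_output(left + 0.05, right - 0.01, delay)
```

In this sketch the output equals the product of the two contrasts times sin(phase) * sin(omega * delay), so doubling one contrast doubles the response, reversing the direction flips its sign, and the stationary offsets cancel in the opponent subtraction.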

van  Santen,  J.  P.  H.  and  Sperling,  G.  (1985)  Elaborated  Reichardt  detectors.  Journal  of  the 
Optical  Society  of  America  -  A,  2,  300-321. 

This  paper  extends  the  predictive  power  of  the  elaborated  Reichardt  model  from  continuous 
to  two-flash  stimuli,  and  to  other  displays,  such  as  random  dot  displays,  that  had  previously  been 
thought  to  require  "feature"  models.  It  points  out  that  the  Reichardt  model  is  consistent  with  a  3D 
spatiotemporal  Fourier  analysis  of  visual  displays.  However,  when  complex  displays  contain 
several Fourier components of approximately equal perceptual strength, a more complex analysis,
such as that of the elaborated Reichardt model, is needed to generate predictions. For example,
displays in which the Fourier components move in the same direction and at the same
temporal frequency exhibit more convincing movement than displays in which the components
move at the same velocity so as to preserve 2D rigidity. It was proved that, for elaborated Reichardt
detectors,  the  strength  of  motion  in  two  flash  displays  is  predicted  by  separable  temporal  and 
spatial  components,  so  that  these  displays  are  ideal  for  studying  the  pure  spatial  properties  of 
motion detectors. Finally, it was proved that two alternative computational theories (Adelson &
Bergen, 1985 and Watson & Ahumada, 1985), for which no experimental data had yet been
generated, were computationally equivalent to the elaborated Reichardt model.

Investigations of Second-Order Motion and Texture

The theoretical analysis and experimental evidence described above establish an elaborated
Reichardt  (or  equivalent  kind  of  motion  computation)  as  the  basic  mechanism  of  motion 
perception.  The  work  of  the  current  granting  period  dealt  with  a  newly  discovered  second 
mechanism  of  motion  perception,  which  was  called  "Second-order"  or  "Non-Fourier"  motion 
processing  to  distinguish  it  from  the  previously  described  "First-order"  or  "Fourier"  motion 
perception.  The  computational  principles  that  applied  to  second-order  motion  perception  were 
found  also  to  apply  to  the  perception  of  two-dimensional  textures. 

Chubb, Charles, and George Sperling. (1988). Drift-balanced random stimuli: A general basis for
studying non-Fourier motion perception. Journal of the Optical Society of America A: Optics and
Image Science, 5, 1986-2006.

This  paper  sets  forth  the  general  principles.  It  shows  how  to  construct  counterexamples  to 
first-order  motion  computations:  visual  stimuli  which  (i)  are  consistently  perceived  as  obviously 
moving  in  a  fixed  direction,  yet  for  which  (ii)  Fourier  domain  energy  analysis  yields  no  systematic 
motion components in any given direction. It then develops a general theoretical framework for investigating
nonFourier (second-order) motion-perception mechanisms, whose two central concepts are driftbalanced
and microbalanced random stimuli. A random stimulus S is driftbalanced if its expected power
in the frequency domain is symmetric with respect to temporal frequency: that is, if the expected
power in S of every drifting sinusoidal component is equal to the expected power of the sinusoid
of the same spatial frequency, drifting at the same rate in the opposite direction. Additionally, S
is microbalanced if the result WS of windowing S by any space-time separable function W is
driftbalanced. It is proved that (i) any space/time separable random (or nonrandom) stimulus is
microbalanced; (ii) any linear combination of pairwise independent microbalanced random
stimuli is microbalanced, and any linear combination of pairwise independent driftbalanced
random stimuli is driftbalanced if the expectation of each component is zero (a uniform field); (iii)
the  convolution  of  independent  micro/driftbalanced  random  stimuli  is  micro/driftbalanced;  (iv)  the 
product  of  independent  microbalanced  random  stimuli  is  microbalanced.  Examples  are  provided 
of  classes  of  driftbalanced  random  stimuli  which  display  consistent  and  compelling  motion  in  one 
direction  although  they  would  be  completely  ambiguous  to  any  first-order  motion  mechanism. 
The  perception  of  nonFourier  motion  stimuli  is  explained  by  postulating  a  linear  space-invariant 
filter  followed  by  a  rectifying  mechanism  that  computes  (any  increasing  function  of)  the  absolute 
value  of  stimulus  contrast  followed  by  Fourier-energy  (e.g.,  Reichardt)  motion  analysis.  All  the 
results  and  examples  from  the  domain  of  motion  perception  are  transposable  to  and  illustrated  in 
the  space-domain  problem  of  detecting  orientation  in  texture  patterns. 
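
The proposed filter-rectify-analyze sequence can be illustrated with one simple stimulus. The sketch below is an illustration under stated assumptions, not the authors' implementation: the stimulus is a drifting contrast envelope multiplying independent binary noise, the initial linear filter is taken to be the identity, the rectifier is the absolute value, and "standard motion analysis" is a bare opponent correlator averaged over all adjacent pixel pairs. The name `motion_response` and all parameter values are hypothetical.

```python
import numpy as np

def motion_response(stim, delay):
    """Mean opponent-correlator output over all adjacent pixel pairs.

    `stim` is indexed [t, x]; positive values signal motion toward +x.
    np.roll gives periodic boundaries in both space and time.
    """
    delayed = np.roll(stim, delay, axis=0)        # stim(x, t - delay)
    right = np.roll(stim, -1, axis=1)             # stim(x + 1, t)
    delayed_right = np.roll(delayed, -1, axis=1)  # stim(x + 1, t - delay)
    return np.mean(delayed * right - delayed_right * stim)

rng = np.random.default_rng(0)
n_t, n_x = 1024, 128
t, x = np.meshgrid(np.arange(n_t), np.arange(n_x), indexing="ij")

# Contrast envelope drifting toward +x, multiplying fresh binary noise
# drawn independently for every pixel and frame.
envelope = 1.0 + 0.5 * np.sin(2 * np.pi * (8 * x / n_x - 32 * t / n_t))
carrier = rng.choice([-1.0, 1.0], size=(n_t, n_x))
stimulus = envelope * carrier  # contrast relative to mean luminance

delay = 4
first_order = motion_response(stimulus, delay)           # applied to contrast directly
second_order = motion_response(np.abs(stimulus), delay)  # after full-wave rectification
```

Because the carrier signs are independent, the expected first-order output is zero and the sample value only fluctuates around zero, whereas rectification recovers the envelope and yields a consistent positive (rightward) response.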

Chubb, Charles, and George Sperling. (1989). Second-order motion perception: Space-time
separable  mechanisms.  Proceedings:  Workshop  on  Visual  Motion.  (March  20-22,  1989,  Irvine, 
California.)  Washington,  D.C:  IEEE  Computer  Society  Press.  Pp.  126-138. 

This  paper  shows  how  various  classes  of  microbalanced  displays  can  be  used  to  derive 
properties  of  second-order  motion  systems.  Microbalanced  stimuli  are  dynamic  displays  which  do 
not  stimulate  mechanisms  that  apply  standard  motion  analysis  directly  to  luminance  (e.g., 
Adelson-Bergen  motion-energy  analyzers,  Watson-Ahumada  motion  sensors,  or  elaborated 
Reichardt detectors). Because they bypass first-order mechanisms, microbalanced stimuli are
uniquely useful for studying second-order motion perception (motion perception served by
mechanisms  that  require  a  grossly  nonlinear  stimulus  transformation  prior  to  standard  analysis). 
The  paper  demonstrates  stimuli  that  are  microbalanced  under  all  pointwise  stimulus 
transformations  and  therefore  immune  to  early  visual  nonlinearities.  Such  stimuli  are  used  to 
disable  motion  information  derived  from  spatial  filtering  in  order  to  isolate  the  temporal  properties 
of  space/time  separable  second-order  motion  mechanisms.  They  are  equally  useful  to  disable  the 
motion  information  derived  from  temporal  filtering  to  isolate  the  spatial  properties. 

The paper proposes that the second-order motion of all of the classes of microbalanced stimuli
under consideration can be extracted by a mechanism consisting of the following stages: (1a)
band-selective spatial filtering and (1b) biphasic temporal filtering, nonzero in dc, followed by (2)
a rectifying nonlinearity and (3) standard motion analysis.

Chubb,  Charles,  and  George  Sperling.  (1989).  Two  motion  perception  mechanisms  revealed  by 
distance  driven  reversal  of  apparent  motion.  Proceedings  of  the  National  Academy  of  Sciences, 
USA,  86,  2985-2989. 

It  is  reasonable  to  ask  whether  there  really  are  two  mechanisms  of  motion  perception  or 
whether one theory can encompass both. One way to demonstrate the existence of two
mechanisms  is  to  stimulate  them  to  simultaneously  give  opposite  outputs  in  response  to  the  same 
stimulus.  This  paper  demonstrates  two  kinds  of  visual  stimuli  that  exhibit  motion  in  one  direction 
when  viewed  from  near  and  in  the  opposite  direction  from  afar.  These  striking  reversals  occur 
because  each  kind  of  stimulus  is  constructed  to  simultaneously  activate  two  different  mechanisms: 
a  short-range  mechanism  that  computes  motion  from  space-time  correspondences  in  stimulus 
luminance  and  a  long-range  mechanism  whose  motion  computations  are  performed,  instead,  on 
stimulus  contrast  that  has  been  full-wave  rectified  (e.g.,  the  absolute  value  of  contrast).  The 
stimuli  were  constructed  so  that  half-wave  rectification  could  be  excluded.  It  is  concluded  that 
both a Fourier and a nonFourier computation occur. In this and all previously studied cases of
second-order motion perception, full-wave rectification has been shown to be a sufficient mechanism;
for these stimuli, full-wave rectification (versus half-wave rectification) is shown to be necessary.

An  analogous  phenomenon,  distance-driven  reversal  of  apparent  slant,  occurs  with  texture 
stimuli.  Apparently,  in  both  motion  and  texture  extraction  from  visual  scenes,  there  are  two 
parallel  mechanisms,  operating  simultaneously,  a  first-order  mechanism  that  operates  directly  on 
the  Fourier  components  of  the  stimulus,  and  a  second-order  mechanism  that  operates  on  a 
spatiotemporally  filtered,  full-wave  rectified  transformation  of  the  stimulus. 

Chubb, Charles, and George Sperling. (1991). Texture quilts: Basic tools for studying
motion-from-texture. Journal of Mathematical Psychology, 35, 411-442.

This  paper  continues  the  investigation  of  motion-from-spatial-texture  in  stimuli  that  are  free 
from  contamination  by  motion  mechanisms  sensitive  to  anything  except  texture.  It  offers  a  formal 
foundation  for  some  of  the  results  outlined  in  Chubb  &  Sperling’s  (1989)  IEEE  paper,  and  reports 
the  results  of  three  demonstration  experiments  that  establish  empirical  properties  of  human 
second-order  motion  perception.  Additionally,  some  concrete  stimulus-construction  methods  are 
provided  for  a  special  class  of  random  stimuli  called  texture  quilts.  Although,  as  is  demonstrated 
experimentally, certain texture quilts display consistent apparent motion, it is proven that their
motion content (a) is unavailable to standard motion analysis (such as might be accomplished by
an  Adelson/Bergen  motion-energy  analyzer,  a  Watson/Ahumada  motion  sensor,  or  by  any 
elaborated  Reichardt  detector),  and  (b)  cannot  be  exposed  to  standard  motion  analysis  by  any 
purely  temporal  signal  transformation  no  matter  how  nonlinear  (e.g.,  temporal  differentiation 
followed  by  rectification).  Applying  such  a  purely  temporal  transformation  to  any  texture  quilt 
produces  a  spatiotemporal  function  P  whose  motion  is  unavailable  to  standard  motion  analysis: 
The  expected  response  of  every  Reichardt  detector  to  P  is  0  at  every  instant  in  time. 

Three  quilts  were  studied  experimentally:  a  quilt  that  relies  on  differences  in  spatial 
frequency  to  generate  perception  of  motion,  a  quilt  that  relies  on  sensitivity  to  differences  in 
orientation, and a quilt that relies on the difference between an even texture and a jointly-independent
random texture. The simplest mechanism sufficient to sense the motion exhibited by
texture quilts consists of three successive stages: (i) a purely spatial linear filter, (ii) a rectifier to
transform regions of large negative or positive responses into regions of high positive values, and
(iii) standard motion analysis. The first quilt demonstrates that the spatial filter is frequency
selective.  The  second  quilt  demonstrates  that  there  exist  orientation  selective  filters.  The  third 
quilt  demonstrates  that  the  rectifier  cannot  embody  a  perfect  squaring  (power)  function. 

Werkhoven, Peter, George Sperling, and Charles Chubb. (1993). The dimensionality of
texture-defined motion: A single channel theory. Vision Research, 33, 463-485.

This  paper  explores  texture-defined  motion  between  similarly  oriented  sinusoidal  patches.  It 
exploits  two  ambiguous  motion  displays  (types  I  and  II)  in  each  of  which  apparent  motion  can  be 
perceived  in  either  of  two  directions.  One  of  these  directions  is  along  a  homogeneous  space-time 
path  in  which  all  successive  sinusoidal  patches  are  identical  in  spatial  frequency  and  contrast. 
The other, oppositely directed, path is composed of heterogeneous patches that vary in
spatial  frequency  and  contrast.  The  striking  and  counterintuitive  result  is  that  for  a  wide  variety  of 
display  conditions,  perceived  motion  along  the  heterogeneous  path  dominates  the  homogeneous 
path.  Obviously,  when  perceived  motion  along  a  path  composed  of  alternating  high-  and  low- 
frequency  patches  dominates  perceived  motion  along  a  pure  high-frequency  path,  the  strength  of 
texture-defined  motion  is  not  governed  by  a  similarity  metric. 

All  the  results  are  explained  in  terms  of  an  activity  transformation.  Each  patch  is  assumed 
to  cause  a  perceptual  response  (activity).  Strength  of  perceived  motion  along  a  path  is  determined 
by  the  product  of  the  activities  of  adjacent  patches  along  the  path.  The  path  with  the  greatest 
product  dominates. 
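
The product rule is easy to state in code. The sketch below is only an illustration of the rule as summarized here; the activity values are hypothetical numbers chosen for the example, and the function names are not from the paper.

```python
def path_strength(activity_first, activity_second):
    """Motion strength between two successive patches: product of activities."""
    return activity_first * activity_second

def dominant_path(activity_same, activity_other):
    """Compare a homogeneous path (same, same) with a heterogeneous path (same, other)."""
    homogeneous = path_strength(activity_same, activity_same)
    heterogeneous = path_strength(activity_same, activity_other)
    if heterogeneous > homogeneous:
        return "heterogeneous"
    if heterogeneous < homogeneous:
        return "homogeneous"
    return "balanced"

# Hypothetical activities: the "other" patch type evokes more activity
# than the patch type shared by both paths.
winner = dominant_path(activity_same=1.0, activity_other=2.0)
```

Under this rule the heterogeneous path wins whenever the differing patch evokes more activity than the shared one, regardless of how dissimilar the two textures look, which is why a similarity metric cannot account for the data.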

Whenever  a  particular  combination  of  patch  contrasts  and  spatial  frequencies  caused  the  two 
motion paths to be balanced in displays of type I, they were also found to be balanced in
type  II  displays,  a  condition  referred  to  as  transition  invariance.  Under  quite  reasonable 
assumptions  about  the  motion  mechanism,  it  was  shown  that  transition  invariance  implies  that 
activity  must  be  a  one-dimensional  quantity.  Indeed,  activity  is  well-described  as  the  rectified 
output  of  a  spatial  low-pass  filter. 

Werkhoven,  Peter,  Charles  Chubb,  and  George  Sperling.  (1994)  Perception  of  Apparent  Motion 
between  Dissimilar  Gratings:  Spatiotemporal  Properties.  Vision  Research.  (Accepted  for 
publication  pending  revisions.) 


Sperling:  AFOSR  Grant  91-0178 


Page  6 


Final  Report 


This  paper  continues  the  search  for  the  determinants  of  the  perceptual  strength  of  texture- 
defined  motion  (i.e.,  motion  strength  of  stimuli  that  have  no  net  directional  energy  in  the  Fourier 
domain). Werkhoven, Sperling, & Chubb (1993) demonstrated that correspondence in spatial
frequency and contrast between neighboring patches of texture in a spatiotemporal motion path is
irrelevant to motion strength; only activity (the rectified output of a spatial lowpass
filter) mattered. As in Werkhoven et al. (1993), the motion stimuli are ambiguous motion
displays  in  which  one  motion  path,  consisting  of  patches  of  nonsimilar  texture,  competes  with 
another  motion  path,  having  patches  only  of  similar  texture.  The  textural  parameters  of  spatial 
frequency,  contrast,  texture  orientation  (slant),  and  temporal  frequency  are  systematically  explored. 

The  data  show  that  motion  between  dissimilar  patches  of  texture  (which  are  orthogonally 
oriented,  have  a  two  octave  difference  in  spatial  frequency  and  differ  50%  in  contrast)  can  easily 
dominate motion between similar patches of texture. The relative motion strength of the two paths is
invariant  with  temporal  frequency  from  1  to  4  Hz.  Analysis  of  the  data  shows  that  the  motion 
computation  is  largely  but  not  entirely  one-dimensional:  Extreme  orientation  differences  and  very 
large  spatial  frequency  differences  bring  into  play  small  but  significant  contributions  of  a  second 
dimension  (or  dimensions). 


2.  Lateral  Interactions  in  Texture  Stimuli:  Contrast-Contrast 

Chubb,  Charles,  George  Sperling,  and  Joshua  A.  Solomon.  (1989).  Texture  interactions  determine 
perceived  contrast.  Proceedings  of  the  National  Academy  of  Sciences,  USA,  86,  9631-9635. 

Various visual illusions that have been demonstrated for first-order stimuli may be expected
to have corresponding second-order illusions. When the illusions are the result of important
properties of signal processing, such as boundary enhancement and gain control, the corresponding
second-order illusions should be quite informative about the corresponding second-order process.
This paper considers the second-order analog to perhaps the most famous first-order lightness
illusion, namely that the apparent lightness of a uniformly illuminated patch depends on the
luminance of its surround. Here it is reported that the perceived contrast of a test patch P of
binary visual noise embedded in a surrounding noise field S depends substantially on the contrast
of S. When P is surrounded by high-contrast noise, its bright points appear dimmer, and
simultaneously, its dark points appear less dark than when P is surrounded by a uniform field,
even though local mean luminance is kept constant across all displays. Sinusoidally modulating
the contrast Cs of the noise surround S causes the apparent contrast of P to modulate in antiphase
to Cs. For P of contrast Cp, nulling procedures show that the induced contrast
modulation of P reaches 0.45 Cp. This very large, heretofore unnoticed, spatial interaction is
unanticipated by all current theories of lightness perception. It suggests a very general principle
of perceptual computation: gain control. Gain control may be a nearly universal process
whereby the response of a detector is normalized relative to the responses of its neighbors in
the  same  and  similar  classes. 

Joshua  A.  Solomon  and  George  Sperling.  (1993).  The  lateral  inhibition  of  perceived  contrast  is 
indifferent  to  on-center/off-center  segregation  but  specific  to  orientation.  Vision  Research,  33, 
2671-2683.

Chubb, Sperling, and Solomon (1989) showed that the perceived contrast of a test patch C of
isotropic spatial texture embedded in a surrounding texture field S depends substantially on the
contrast of the texture surround S. When C is surrounded by a high-contrast texture with
similar spatial frequency content, it appears to have less contrast than when it is surrounded by
a uniform field. This paper describes two novel textures: T+, which is designed to selectively
stimulate only the on-center system, and T-, which stimulates only the off-center system. When the
types of C and of S are chosen to be T+ or T-, the reduction of C's apparent contrast does not vary
with the combination of T+ and T-. This demonstrates that the reduction of C's apparent contrast is mediated
by a mechanism whose neural locus is central to the interaction between on-center and off-center
visual systems.

The induced reduction of apparent contrast is shown to be orientation specific: the
reduction of grating C's apparent contrast by a surround grating S of the same spatial frequency
is  greatest  when  C  and  S  have  equal  orientation.  Using  dynamically  phase-shifting  sinusoidal 
gratings  of  3.3,  10  and  20  cpd,  the  reduction  of  apparent  contrast  was  measured  using  different 
contrast-combinations  of  C  and  S. 

The results: (1) Both parallel and orthogonal S gratings caused suppression of C's apparent
contrast  relative  to  a  uniform  surround.  (2)  In  all  of  the  viewing  conditions,  the  reduction  of 
apparent  contrast  induced  by  the  parallel  surrounds  was  at  least  as  great  as  that  induced  by  the 
perpendicular  surrounds.  Often  it  was  much  greater  (orientation  specificity).  (3)  Orientation 
specificity  increased  with  greater  spatial  frequencies  and  with  lower  stimulus  contrasts.  The 
results  suggest  a  contrast  perception  mechanism  in  which  both  oriented  and  nonoriented  units 
determine  the  perceived  lightness  or  darkness  of  a  point  in  visual  space,  and  every  unit  is 
inhibited  primarily  by  similar  adjacent  units. 


3.  Information  Processing:  Frequency  Bands,  Subsampling,  Noise;  Space  and  Object  Perception 

This cluster of projects determined, in several domains, how to package information most
efficiently for an observer. Obviously, issues of external representation of information are
inextricably  tied  to  the  question  of  "What  internal  representation  does  the  observer  use?"  Such 
investigations  may  lead  to  useful  formulations  of  how  to  improve  both  information  presentation 
and  observer  training.  The  basic  method  was  to  partition  the  total  stimulus  information  into 
several  spatial  frequency  bands,  and  to  determine  performance  individually  for  the  component 
bands. Additionally, Riedl and Sperling studied cross-band masking and measured how
information  from  component  frequency  bands  combines  in  a  complex,  dynamic  visual  stimulus. 

The "Three-stages and two systems" paper in this sequence proposes a theoretical analysis
of the basic computations of visual preprocessing. It shows how results from motion and texture
discrimination experiments derive from the same mechanisms that serve higher-order object
perception. The eye movement paper in this sequence deals with the internal representation of
scenes that derive from a sequence of saccadic eye movements, and with the visual mechanisms
that serve the saccadic mode of information acquisition.

Sperling, Wurst & Lu deal with a new method of discriminating early from late attentional
filtering of features that occur at a single location. Their paradigm, which was applied to a
repetition detection task, is easily extended to visual search, and this forms the basis of the
proposed  experiments. 


Riedl, Thomas R. and George Sperling. Spatial frequency bands in complex visual stimuli:
American Sign Language. Journal of the Optical Society of America A: Optics and Image Science,
1988, 5, 606-616.

This  project  examined  dynamic  images  of  individual  signs  of  American  Sign  Language 
(ASL)  with  a  resolution  of  96  x  64  pixels  which  were  bandpass  filtered  in  adjacent  frequency 
bands. Intelligibility was determined by testing deaf subjects fluent in ASL. (a) By iteratively
varying the center frequencies and bandwidths of the spatial bandpass filters, it was possible to
divide the original signal into four adjacent component bands of approximately equal, high
intelligibility (67-87% for isolated ASL signs), any one of which yielded adequate identification accuracy.
(b) The empirically measured temporal frequency spectrum was approximately the same in all
bands. (c) The masking of signals in band i by noise in band j was found to be proportional to
the frequency similarity: $\lvert \log (f_{\mathrm{noise}} / f_{\mathrm{signal}}) \rvert$. At constant performance,
$(\mathrm{RMS})_{\mathrm{signal}} / (\mathrm{RMS})_{\mathrm{noise}}$ was the same for bands 2, 3, and 4 and higher for band 1. (d) The most
effective masking noise is slightly lower in spatial frequency than the stimulus ($\Delta\omega = 1.4$). (e)
Intelligibility  for  the  sum  of  two  very  weak  signals  is  greater  the  closer  they  are  in  spatial 
frequency;  for  strong  signals,  the  reverse  is  true.  The  dominant  factor  for  weak  signals  is 
square-law  additivity  of  signal  power;  for  strong  signals,  redundancy  within  a  band  is  the  limiting 
factor. 


Parish,  David  H.  and  George  Sperling.  Object  spatial  frequencies,  retinal  spatial  frequencies, 
noise,  and  the  efficiency  of  letter  discrimination.  Vision  Research,  1991,  31,  1399-1415. 

The  26  upper-case  letters  of  English  were  used  to  determine  which  spatial  frequencies  are 
most  effective  for  letter  identification,  and  whether  this  is  because  letters  are  objectively  more 
discriminable  in  these  frequency  bands  or  because  observers  can  utilize  the  information  more 
efficiently.  Six  two-octave  wide  filters  produced  spatially  filtered  letters  with  2D-mean 
frequencies  ranging  from  0.4  to  20  cycles  per  letter  height.  Subjects  attempted  to  spatially  filtered 
letters  in  the  presence  of  identically  filtered,  added  Gaussian  noise.  The  percent  of  correct  letter 
identifications  was  measured  as  a  function  of  shi  in  each  band  at  each  of  four  viewing  distances 
ranging  over  32:1.  In  this  paradigm,  object  spatial  frequency  band  and  stn  determine  presence  of 
information  in  the  stimulus;  viewing  distance  determines  retinal  spatial  frequency,  and  affects  only 
ability  to  utilize,  (a)  Viewing  distance  had  no  effect  upon  letter  discriminability:  object  spatial 
frequency,  not  retinal  spatial  frequency,  determined  discriminability.  (b)  With  the  assistance  of 
Charles Chubb, an ideal detector was computed for the letter identification task. For these two-octave
wide bands, s/n performance of humans and of the ideal detector improved with frequency
mainly because linear bandwidth increased as a function of frequency. (c) Human discrimination
efficiency (which compares human discrimination to an ideal discriminator) was 0 in the lowest
frequency bands, reached a maximum of 0.42 at 1.5 cycles per object, and dropped to about 0.104
in the highest band. (d) Upper-case letter information is best extracted from spatial frequencies of
1.5 cycles per object height, and with equally high efficiency over at least a 32:1 range of retinal
frequencies from 0.074 to more than 2.3 cycles per degree of visual angle.
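The distinction between object and retinal spatial frequency can be illustrated with a short calculation. The sketch below is not taken from the paper; the letter height and viewing distances are hypothetical values chosen only to span a 32:1 range. It shows that changing viewing distance rescales retinal frequency (cycles per degree) while the object frequency (cycles per letter height) stays fixed.

```python
import math

def letter_height_deg(height_cm, distance_cm):
    """Visual angle, in degrees, subtended by a letter of the given height."""
    return math.degrees(2.0 * math.atan(height_cm / (2.0 * distance_cm)))

def retinal_frequency(cycles_per_letter, height_cm, distance_cm):
    """Convert object frequency (cycles per letter height) to cycles per degree."""
    return cycles_per_letter / letter_height_deg(height_cm, distance_cm)

# Hypothetical geometry: a 2 cm letter viewed at 25 cm and at 800 cm (32:1).
HEIGHT_CM = 2.0
OBJECT_FREQ = 1.5  # cycles per letter height, the most efficient band in the text

f_near = retinal_frequency(OBJECT_FREQ, HEIGHT_CM, 25.0)
f_far = retinal_frequency(OBJECT_FREQ, HEIGHT_CM, 800.0)
ratio = f_far / f_near  # close to 32; slightly less because of the arctangent
```

Under these assumed values the same 1.5 cycles-per-letter band lands at retinal frequencies that differ by a factor of nearly 32, which is the sense in which efficiency is reported as constant over a 32:1 range of retinal frequencies.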


Parish,  David  H.,  George  Sperling,  and  Michael  S.  Landy.  Intelligent  temporal  subsampling  of 
American  Sign  Language  using  event  boundaries.  Journal  of  Experimental  Psychology:  Human 
Perception  and  Performance,  1990,  16,  282-294. 

This  paper  investigates  the  effects  of  temporal  stimulus  subsampling  and  the  form  of 
stimulus  representation  on  intelligibility  of  a  complex  visual  stimulus  (American  Sign  Language). 
How  well  can  a  sequence  of  ASL  frames  be  represented  by  a  subset  of  the  frames,  and  how  is  the 
subset  optimally  chosen?  Two  drastically  different  representations  of  frame  sequences  were 
investigated:  dynamic  (ordinary  video  viewing)  and  static  (component  frames  placed  side-by-side 
in  a  single  display).  Secondarily,  full  gray  scale  images  were  compared  with  binary  images 
(cartoons). An activity index was used to select critical frames at event boundaries: moments in
the sequence where the difference between successive frames has a local minimum. Identification
accuracy (intelligibility) was measured for 32 experienced ASL signers who viewed 84 variously
constructed sequences of isolated ASL signs. With dynamic sequences that utilized full gray
scale, activity-index subsampling yielded significantly more intelligible sequences than simple
repetition of every n-th frame, achieving relative compression ratios of up to 2:1. For static
sequences,  activity  subsampling  with  a  small,  optimal  number  of  frames  achieved  higher 
intelligibility  than  was  achieved  by  choosing  every  n-th  frame,  for  any  n.  Binary  images  were 
less  intelligible  than  the  gray  scale  images,  and  the  relative  advantage  of  activity  subsampling  was 
smaller. 

(1)  Event  boundaries  can  be  defined  computationally.  Sequences  composed  of  frames 
chosen  from  event  boundaries  yielded  higher  intelligibility  than  sequences  composed  of  equal 
numbers  of  frames  spaced  at  regular  intervals.  (2)  Static  presentation  of  subsets  of  selected 
frames  can  yield  intelligible  ASL  "text"  of  isolated  signs  and  perhaps,  eventually,  of 
conversational  ASL. 
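The computational definition of an event boundary can be sketched as follows. This is a minimal illustration, not the procedure of the paper: the activity measure (sum of absolute pixel differences), the tie-breaking rule, and the choice to report the earlier frame of each pair are all assumptions made here, and frames are represented as flat lists of pixel intensities.

```python
def activity_index(frames):
    """Activity between successive frames: sum of absolute pixel differences.

    Element i is the difference between frame i and frame i + 1.
    """
    return [
        sum(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(frames, frames[1:])
    ]

def event_boundaries(frames):
    """Indices i at which the activity index has an interior local minimum."""
    act = activity_index(frames)
    picks = []
    for i in range(1, len(act) - 1):
        # Assumed rule: no larger than the left neighbor, smaller than the right.
        if act[i] <= act[i - 1] and act[i] < act[i + 1]:
            picks.append(i)
    return picks

# Toy one-pixel "frames": motion slows near frames 2 and 5.
toy = [[0], [4], [7], [8], [12], [18], [19], [25]]
selected = event_boundaries(toy)
```

In the toy sequence the activity values are 4, 3, 1, 4, 6, 1, 6, so the selected indices are the two moments of least change; a regular every-n-th-frame sampler would have no reason to land on them.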


This  research  opens  the  general  question  of  how  to  use  printing  technology  in  place  of  video 
technology,  where  the  printing  technology  is  enhanced  at  the  point  of  production  by  computer 
graphics techniques. How can an automatically generated sequence of images best be used, like
a comic book, to represent a dynamic sequence of events? When an artist is required to
represent  the  images  for  eventual  printing,  the  cost  can  be  prohibitive.  When  the  images  can  be 
automatically  generated  from  a  video  recording,  the  production  costs  are  minor.  The  ASL  study 
demonstrates  the  feasibility  of  representing  a  dynamic  ASL  sign  by  a  simultaneously  visible 
packet  of  images.  Research  is  needed  to  determine  how  these  results  might  be  generalized  to 
more  complex  communications  and  to  practical  training  problems  that  involve  dynamic  actions. 


Sperling,  George.  Three  stages  and  two  systems  of  visual  processing.  Spatial  Vision,  1989,  4, 
183-207. 

This paper offers a theoretical synthesis of classic work on light adaptation and on visual
thresholds for pattern stimuli, work on efficiency of identification in various spatial frequency
bands, and work on motion and texture perception, in terms of three stages and two systems of
visual processing. The initial question is: How would an internal noise (at various levels of
perceptual processing) appear to an external observer? This is determined by the internal location of
the  noise  relative  to  three  stages  of  visual  processing:  light  adaptation,  contrast  gain  control,  and  a 
postsensory/decision  stage.  Dark  noise  occurs  prior  to  adaptation,  determines  dark-adapted 
absolute  thresholds,  and  mimics  stationary  external  noise.  Sensory  noise  occurs  after  dark 
adaptation,  determines  contrast  thresholds  for  sine  gratings  and  similar  stimuli,  and  mimics 
external  noise  that  increases  with  mean  luminance.  Postsensory  noise  incorporates  perceptual, 
decision, and mnemonic processes. It occurs after contrast-gain control and mimics external noise
that increases with stimulus contrast (i.e., external multiplicative noise).
Dark noise and sensory noise are frequency specific and primarily affect
weak  signals.  Only  postsensory  noise  significantly  affects  the  discriminability  of  strong  signals 
masked  by  stimulus  noise;  postsensory  noise  has  constant  power  over  a  wide  spatial  frequency 
range in which sensory noise varies enormously. Especially in dealing with modulation transfer
functions, the spectrum of internal sensory noise
(which unavoidably depends on spatial frequency) has often been confused with the gain factor of sensory transmission
(which ideally would be independent of spatial frequency).

Two  parallel  perceptual  regimes  jointly  serve  human  object  recognition  and  motion 
perception:  a  first-order  linear  (Fourier)  regime  that  computes  relations  directly  from  stimulus 
luminance,  and  a  second-order  nonlinear  (nonFourier)  rectifying  regime  that  uses  the  absolute 
value  (or  power)  of  stimulus  contrast.  When  objects  or  movements  are  defined  by  high  spatial 
frequencies  (i.e.,  texture  carrier  frequencies  whose  wavelengths  are  small  compared  to  the  object 
size),  the  responses  of  high-frequency  receptors  are  demodulated  by  rectification  to  facilitate 
discrimination  at  the  higher  processing  levels.  Rectification  sacrifices  the  statistical  efficiency 
(noise  resistance)  of  the  first-order  regime  for  efficiency  of  connectivity  and  computation. 
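The role of rectification in the second-order regime can be illustrated with a one-dimensional toy stimulus. The sketch below is an assumption-laden illustration rather than the model in the paper: the carrier, envelope values, and box-filter window are hypothetical. A high-frequency carrier is multiplied by a contrast envelope that differs between the two halves of the stimulus; a linear low-pass stage sees nothing, whereas the same stage applied after full-wave rectification (absolute value) recovers the envelope.

```python
def local_mean(signal, width):
    """Box-filter low-pass stage: mean over successive non-overlapping windows."""
    return [
        sum(signal[i:i + width]) / width
        for i in range(0, len(signal) - width + 1, width)
    ]

# Hypothetical stimulus: alternating +1/-1 carrier, high contrast on the left
# half and low contrast on the right half. Values are contrasts about the mean.
carrier = [1.0 if i % 2 == 0 else -1.0 for i in range(32)]
envelope = [0.8 if i < 16 else 0.2 for i in range(32)]
contrast = [c * e for c, e in zip(carrier, envelope)]

# First-order (linear) path: positive and negative lobes cancel in each window.
first_order = local_mean(contrast, 8)

# Second-order path: full-wave rectification before the same low-pass stage.
second_order = local_mean([abs(v) for v in contrast], 8)
```

Here `first_order` is zero in every window, so the object defined by the contrast envelope is invisible to the linear path, while `second_order` is approximately 0.8 on the left and 0.2 on the right.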


Sperling,  George.  Comparison  of  perception  in  the  moving  and  stationary  eye.  In  E.  Kowler 
(Ed.), Eye Movements and their Role in Visual and Cognitive Processes. Amsterdam, The
Netherlands: Elsevier Biomedical Press, 1990. Pp. 307-351.

This paper reports the construction of an apparatus for producing simulated saccades:
continuous sequences of images on a stationary retina that are equivalent to the images produced
on the retina during saccadic eye movements. Spatial localization was studied for stimuli flashed
during  real  eye  movements  (using  a  limbus  monitor)  and  during  identical  image  sequences 
(simulated  saccades)  produced  on  a  stationary  retina.  The  comparison  between  real  and  simulated 
saccades  gives  critical  insights  into  those  mechanisms  that  are  particular  to  saccades.  The  paper 
reviews  the  historically  important  paradigms  (and  representative  experiments)  that  purport  to  deal 
with  special  modes  of  saccadic  processing.  On  the  basis  of  all  these  data,  it  proposes  a  theory  to 
account  for  saccadic  simulation  experiments  and  to  deal  with  such  questions  about  human  visual 
perception  as: 

Why  don’t  we  see  the  smear  produced  on  the  retina  during  an  eye  movement? 

Why  doesn’t  the  world  appear  to  move  as  a  result  of  the  image  movements  produced  by  eye 
movements? 

Does the visual system require sudden stimulus onsets (such as those produced by eye
movements)  to  initiate  processing  episodes? 

To  serve  the  perceptual  construction  of  a  stable  representation  of  the  world,  is  there  a  special 
memory  to  relate  images  produced  by  successive  eye  movements? 


Sperling, G., Wurst, S. A., and Lu, Z-L. (1993). Using repetition detection to define and localize
the processes of selective attention. In D. E. Meyer and S. Kornblum (Eds.), Attention and
Performance XIV: Synergies in Experimental Psychology, Artificial Intelligence, and Cognitive
Neuroscience - A Silver Jubilee. Cambridge, MA: MIT Press. Pp. 265-298.

Can subjects selectively attend to a subset of items in rapid display sequences when the
subset is characterized by an obvious physical feature but all items occur in the same location?
The  paradigm  is  a  repetition  detection  task  in  which  subjects  search  a  very  rapidly  presented 
sequence  of  thirty  superimposed  frames  for  an  item  that  is  repeated  within  four  frames. 
Successful  detection  implies  that  a  match  occurs  between  an  incoming  item  and  a  recent  item 
retained in short-term visual repetition memory (STVRM). Previous results (Kaufman, 1978;
Wurst,  1989)  showed  that  detection  of  visual  repetitions  in  a  rapid  stream  of  items  is  indifferent  to 
eye  of  origin  and  to  interposed  masking  fields,  and  functions  as  well  for  nonsense  shapes  as  for 
digits.  Therefore,  STVRM  is  visual,  not  verbal  or  semantic.  It  is  governed  by  interference  from 
new  items;  it  does  not  suffer  passive  decay  within  the  short  interstimulus  intervals  under  which  it 
has  been  tested. 

This  paper  uses  a  novel  elaboration  of  a  repetition  detection  paradigm.  Within  the  stream, 
the  physical  features  of  the  successive  items  alternate  in  color,  size  or  spatial  frequency.  For 
example,  in  the  size  condition,  the  odd-numbered  items  in  the  stream  are  large  and  the  even- 
numbered  items  are  small.  Subjects  attend  selectively  to  small  (or  to  large)  items.  Using 
selective  attention  instructions  with  the  repetition  detection  task  permits  testing  the  extent  to 
which,  at  a  single  location,  subjects  can  filter  rapidly-successive  items  according  to  their  physical 
characteristics.  By  presenting  all  the  items  at  the  same  location,  only  attentional  selection 
according  to  features  (and  not  according  to  location)  is  effective.  Subjects  selectively  attended  to 
subsets  of  characters  based  on  physical  differences  of  orientation,  contrast  polarity,  color,  size, 
spatial  bandpass  filtering,  and  polarity-and-size  combined. 

Results.  Efficiency  of  attentional  selection  was  determined  by  comparing  performance  in  a 
stream  of  characters  that  alternated  a  physical  feature  with  performance  in  two  control  conditions: 
One  in  which  the  to-be-unattended  characters  were  optically  filtered  and  another  in  which  all 
characters  shared  the  same  physical  feature.  Selection  efficiency  in  bandpass  filtered  streams  and 
in  the  polarity-and-size  streams  was  greater  than  50  percent.  Attentional  selection  based  on  the 
other  physical  features  was  less  effective  or  ineffective. 

Corresponding  to  the  benefits  of  attentional  selection  in  detecting  to-be-attended  repetitions, 
there  were  large  costs  in  the  detection  of  unattended  features.  Costs  were  more  ubiquitous  than 
benefits. 


In addition to studying repetitions of items that shared a physical feature (homogeneous
repetitions), heterogeneous repetitions were studied. Costs for detecting heterogeneous repetitions
(relative to homogeneous repetitions) were widespread, indicating that physical features are
represented  in  STVRM.  The  corresponding  stimulus  benefits  of  detecting  homogeneous 
repetitions  in  feature-alternating  streams  (under  equal  attention)  were  small  and  only  occasionally 
significant. 

If  the  state  of  attention  were  represented  in  STVRM,  we  would  expect  a  cost  in  the  detection 
of  heterogeneous  repetitions  with  selective  attention  instructions  (because  the  attentional  state 
would differ for the two elements of the pair). Such costs were observed and, in some instances,
they  occurred  even  when  there  was  no  corresponding  benefit  for  selective  attention  in 
homogeneous  detections.  This  was  interpreted  as  a  lack  of  early  attentional  filtering  compensated 
by  a  memory  tag  representing  whether  or  not  an  item  was  attended. 

Conclusion: The largest attentional effects occur at the level of attentional selection prior to
encoding in STVRM (for bandpass and polarity-and-size stimuli); however, even when early
attentional filtering fails, attentional selection can still occur in STVRM.


4.  Visual  Attention  and  Short-Term  Memory 

Performance  in  many  visual  tasks  depends  not  only  on  characteristics  of  the  visual  system, 
but  also  on  more  cognitive  processes  involved  in  processing  visual  information,  such  as  attention 
and  memory.  The  experiments  seek  to  dissect  the  processes  involved  in  short-term  attentional 
control  and  the  corresponding  short-term  memory  systems.  The  experimental  methods  mostly 
involve rapid sequences of displays because our past work has shown that temporal sequences can
be  used  to  sample  the  time  course  of  temporal  processing.  The  work  on  visual  persistence,  iconic 
memory,  and  related  phenomena  exemplifies  processing  in  the  absence  of  successive  events;  i.e., 
single-event  processing. 


Background

The  attention  experiments  herein  and  many  prior  experiments  from  the  vast  literature  on 
visual  attention  are  encompassed  in  a  general  theoretical  framework.  The  starting  point  is  the  first 
published  demonstration  of  an  attentional  operating  characteristic  (Sperling  and  Melchner,  1976, 
1978a)  and  the  concept  of  attentional  resources  developed  by  Navon  and  Gopher  (1979),  Norman 
and  Bobrow  (1975),  and  others. 

Sperling, G. A unified theory of attention and signal detection. In R. Parasuraman and D. R.
Davies (Eds.), Varieties of Attention. New York, N. Y.: Academic Press, 1984. Pp. 103-181.

A state of attention is characterized by a particular allocation of processing and mnemonic resources,
and this allocation determines the joint performance on two (or more) competing tasks. The
Attention Operating Characteristic (AOC) is the range of possible joint performances as resource
allocation is varied from one extreme to the other. This paper demonstrates that the AOC is
generated by a process that is mathematically equivalent to the process that generates the receiver
operating characteristic (ROC) of signal detection theory (i.e., the process partitions observations
into either signal or noise response categories).

This article also proposes a formal definition of a task as a triple of two sets (stimuli and
responses)  and  a  mapping  between  them  (a  utility  function).  The  task  definition  enabled  a 
distinction  between  compound  and  concurrent  tasks.  Concurrent  tasks  were  shown  to  be 
especially  useful  in  the  study  of  attention,  whereas  compound  tasks  involved  primarily  the  study 
of  decision  making,  and  resulted  in  considerable  difficulties  when  they  were  applied  to  attention. 
The  utility  function  (in  the  task  definition)  is  essential  to  understanding  human  performance.  In 
contemporary,  formal  theory,  "utility"  plays  the  same  role  as  did  "purpose"  in  earlier,  informal 
accounts of behavior.

Sperling,  G.,  and  B.  A.  Dosher.  (1986).  Strategy  and  optimization  in  human  information 
processing.  In  K.  Boff,  L.  Kaufman,  and  J.  Thomas  (Eds.),  Handbook  of  Perception  and 
Performance.  Vol.  1.  New  York,  NY:  Wiley,  1986.  Pp.  2-1  to  2-65. 

This  highly  condensed,  encyclopedic  treatment  of  a  large  literature  on  attention  and 
performance  is  equivalent  to  over  200  ordinary  book  pages  plus  more  than  100  figure  panels. 
Concepts  such  as  formal  task  definitions,  compound  and  concurrent  tasks,  attentional  resources, 
attentional  operating  characteristics,  and  more  generally,  strategies  to  optimize  performance,  are 
applied  to  the  interpretation  of  data  from  many  classical  paradigms.  This  yields  a  deeper 
understanding  and,  in  many  instances,  vastly  different  conclusions. 


Attentional  Trajectories. 

The  Wilson  Cloud  Chamber  and  Glaser  Bubble  Chamber,  which  are  designed  to  make 
visible  the  trajectories  of  individual  atomic  and  subatomic  particles,  work  by  populating  the 
volume  within  which  a  particle  will  move  with  steam  or  superheated  liquid.  When  a  target 
particle moves through the chamber, a few of the molecules it strikes form the nuclei of condensing
droplets or evolving bubbles, and the visible track of these droplets or bubbles defines the
trajectory. 

Sperling  and  Reeves  (1980)  introduced  an  analogous  procedure  in  the  realm  of 
measurements  of  human  attention.  A  rapid  stream  of  superimposed  visual  items  was  presented  at 
rates  of  up  to  13  per  second  in  a  single  spatial  location.  Subjects  attended  a  second  location.  At 
a  critical  moment  during  the  sequence,  subjects  were  cued  to  execute  a  shift  of  attention  to  the 
stream location, and to report the earliest four of the items. The histogram (distribution) of the
actually  reported  items  (a  small  fraction  of  the  presented  items)  defined  the  rapid  growth  and 
subsequent  decline  of  attention  at  the  stream  location.  This  paradigm  made  it  possible  to  measure 
reaction  times  of  shifts  of  visual  attention.  Indeed,  the  paradigm  allows  the  measurement  not  only 
of the mean reaction time of an attentional shift but of the entire density function of attention
reaction  times  (ARTs).  Mean  ARTs  were  shown  to  be  quite  similar  to  motor  reaction  times 
(MRTs)  and  to  covary  with  MRTs  in  response  to  factors  such  as  task  difficulty  and  target 
predictability. 

Reeves,  A.,  and  G.  Sperling.  (1986)  Attention  gating  in  short-term  visual  memory.  Psychological 
Review,  93,  180-206. 

This paper offers a computational model of a shift of visual attention, greatly enlarging on
the procedures of Sperling & Reeves (1980). An attention shift takes attention from its initial
location a to a second location b. While attention is focussed at a, stimulus information from a
is admitted to further processing, and stimulus information from b is excluded. After the shift,
the roles of a and b are reversed. The process of shifting attention to b is conceptualized as the
opening of an attentional gate at b. In Reeves and Sperling’s (1980) attentional task, location b
contains a rapid stream of characters, so the attention gate remains open at b only for a brief
period to avoid flooding memory with irrelevant items.

The  theory  assumes  that  the  fraction  of  stimulus  information  passed  on  to  higher  mental 
processes  from  a  location  in  space  and  a  moment  in  time  is  proportional  to  the  attentional 
allocation  at  that  location.  The  theory  contains  only  three  parameters:  First,  there  is  a  latency 
between  the  signal  to  shift  attention  and  the  start  of  the  attention  shift.  Second,  the  time  course  of 
gate  opening  is  described  by  a  second-order  gamma  function  with  a  time  constant,  typically,  of 
several  hundred  msec.  Third,  there  is  the  amplitude  of  internal  noise  that  determines  the  signal- 
to-noise  ratio  of  the  internally  represented  information. 

The  data  set  is  quite  complex,  and  the  theory  makes  accurate  predictions  of  literally 
hundreds  of  data  points  with  these  few  parameters. 
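The three parameters can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: "second-order gamma function" is taken here to mean a gamma density with shape parameter 2, item strength is taken to be the integral of the gate over the item's presentation interval, and the parameter values (latency 0.15 s, time constant 0.2 s, 10 items per second) are hypothetical rather than fitted values from the paper.

```python
import math
import random

def gate(t, latency, alpha):
    """Assumed gate: gamma density of shape 2, delayed by `latency` seconds."""
    if t <= latency:
        return 0.0
    u = (t - latency) / alpha
    return (u / alpha) * math.exp(-u)

def item_strengths(n_items, item_dur, latency, alpha, sigma=0.0, rng=None, steps=200):
    """Strength of each stream item: gate area over its interval plus noise.

    `sigma` is the standard deviation of the assumed Gaussian internal noise;
    with sigma == 0 the result is deterministic.
    """
    dt = item_dur / steps
    strengths = []
    for k in range(n_items):
        start = k * item_dur
        # Midpoint-rule integral of the gate over the k-th item's interval.
        area = sum(gate(start + (j + 0.5) * dt, latency, alpha) for j in range(steps)) * dt
        noise = rng.gauss(0.0, sigma) if (rng is not None and sigma > 0.0) else 0.0
        strengths.append(area + noise)
    return strengths

# Time is measured from the cue to shift attention.
noiseless = item_strengths(30, 0.1, latency=0.15, alpha=0.2)
noisy = item_strengths(30, 0.1, latency=0.15, alpha=0.2, sigma=0.05, rng=random.Random(0))
best_item = noiseless.index(max(noiseless))
```

With these values the gate peaks at latency plus one time constant (0.35 s after the cue), so the item occupying the 0.3 to 0.4 s interval receives the greatest strength; adding noise perturbs the ranking of neighboring items, which is how a model of this kind can generate a distribution of reported items rather than a single position.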


Sperling,  George.  The  magical  number  seven:  Information  processing  then  and  now.  In  William 
Hirst  (Ed.),  The  making  of  cognitive  science:  Essays  in  honor  of  George  A.  Miller.  Cambridge, 
UK:  Cambridge  University  Press,  1988. 

This article analyzes why the magical number 7 ± 2 had such a major impact on cognitive
science; it is the most cited experimental/theoretical article in Psychology. The article 7 ± 2
offers a theoretical account of absolute judgment (sensory categorization) experiments and of
short-term memory experiments. Both kinds of experiments have a limit of 7 (bits, and items,
respectively). There are no self-citations in the references. All of the evidence Miller used was
publicly available. Miller, like Sherlock Holmes, was the one who was able to formulate a
theory  to  encompass  these  data,  and  it  was  perhaps  the  first  plausible  quantitative  theory  to  deal 
with  the  microprocess  of  cognition. 

The  second  part  of  the  analysis  deals  with  the  current  status  of  Miller’s  proposals.  Miller’s 
seven-item  limit  turns  out  to  depend  on  factors  such  as  acoustic  confusability,  implying  that  the 
item  limit  is  based  on  a  sensory-based  acoustic  memory  rather  than  an  abstract  memory.  The 
review then points out that a single memory system (a stack of seven items) can encompass both
the  bit  and  the  item  limits  Miller  had  proposed.  In  a  sensory  categorization  experiment,  the  seven 
items  in  working  memory  are  items  with-respect-to-which  new  items  are  judged.  In  a  short-term 
recall  experiment,  they  are  the  to-be-recalled  items.  Such  a  stack  memory  is  easily  embodied  in  a 
neural  network.  Thus,  a  simple  neural  network  memory  model  can  encompass  the  two  main 
tenets  of  Miller’s  magical  number  seven. 


Weichselgartner, E., and George Sperling. (1987) Dynamics of automatic and controlled visual
attention. Science, 238, 778-780.

This paper uses the Sperling & Reeves (1980) paradigm to isolate and measure the partially concurrent
time courses of automatic and controlled attentional shifts. The automatic component is extremely
rapid,  very  brief  in  duration,  and  relatively  effortless.  The  controlled  component  has  the  same 
time  course  as  the  previously  measured  attention  shifts  (Sperling  &  Reeves,  1980;  Reeves  & 
Sperling,  1986),  is  slower,  has  a  longer  duration,  and  is  effortful. 

Sperling,  George,  and  Weichselgartner,  Erich.  (199x).  Episodic  theory  of  the  dynamics  of  spatial 
attention.  Psychological  Review.  (Under  revision.) 

This  paper  re-analyzes  previous  measurements  of  visual  attention  in  simple  reaction-time, 
choice  reaction-time  and  complex  discrimination  experiments  in  which  attention  was  purported  to 
move  continuously  across  space.  All  these  data  plus  data  from  attention  gating  experiments  were 
shown  to  be  quantitatively  predicted  by  a  quantal  (episodic)  theory  of  spatial  attention  that 
proposes instead: (a) visual attention can be resolved into a sequence of discrete attentional acts
(episodes); (b) each attentional episode is defined by its spatial facilitation function f(x, y); (c) the
transition at time t0 between episodes is described by a temporal alerting/gating function G(t - t0);
(d) f and G are space-time separable. In support of the theory, new experiments are reported
that use a concurrent motor reaction-time task to assess changes in discriminability with distance.
When  non-attentional  factors  are  corrected  for,  the  duration  of  an  attention  shift  is  independent  of 
the  spatial  distance  traversed  and  of  the  presence  or  absence  of  interposed  visual  obstacles.  New 
experiments  that  test  and  confirm  the  theory  are  reported. 
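Space-time separability can be sketched numerically. The combination rule below, in which the old episode fades as 1 - G while the new one grows as G, is an assumption adopted for illustration, as are the Gaussian facilitation function, the cumulative gamma form of G, the one-dimensional space, and all parameter values. The point of the sketch is only that, under a separable form, the time course at the new location does not depend on how far attention has to travel, and locations between the two episodes are never facilitated.

```python
import math

def facilitation(x, center, width=1.0):
    """Assumed spatial facilitation f(x): a Gaussian bump around `center`."""
    return math.exp(-((x - center) / width) ** 2)

def transition(t, t0, alpha=0.1):
    """Assumed gating function G(t - t0): rises smoothly from 0 to 1 after t0."""
    if t <= t0:
        return 0.0
    u = (t - t0) / alpha
    return 1.0 - (1.0 + u) * math.exp(-u)

def attention(x, t, old_center, new_center, t0):
    """Separable mixture of two episodes: f_old fades out as f_new fades in."""
    g = transition(t, t0)
    return (1.0 - g) * facilitation(x, old_center) + g * facilitation(x, new_center)

# Attention at the destination, 0.2 s after the transition begins,
# for a short shift (5 units) and a long shift (10 units).
near = attention(5.0, 0.3, old_center=0.0, new_center=5.0, t0=0.1)
far = attention(10.0, 0.3, old_center=0.0, new_center=10.0, t0=0.1)

# Attention halfway along the long shift at the same moment.
midway = attention(5.0, 0.3, old_center=0.0, new_center=10.0, t0=0.1)
```

In this sketch `near` and `far` agree to within numerical precision and `midway` is essentially zero, in contrast to a continuously moving spotlight, which would facilitate the midpoint in transit and take longer to arrive at a more distant target.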


Gegenfurtner,  K.  and  Sperling,  G.  (1993).  Information  transfer  in  iconic  memory  experiments. 
Journal of Experimental Psychology: Human Perception and Performance, 1993, 19, 845-866.

This  paper  investigates  the  role  of  selective  and  nonselective  transfer  processes  in  partial 
reports  of  information  from  briefly  exposed  letter  arrays.  In  order  to  report  letters,  viewers  must 
transfer  information  from  a  rapidly  decaying  persistence  trace  (iconic  memory)  to  a  more  durable 
short  term  memory.  At  some  time  following  termination  of  the  display,  subjects  are  cued  to 
report  a  particular  row  of  letters.  Transfer  that  occurs  prior  to  the  cue  is  nonselective;  transfer  that 
occurs after the cue is selective. (a) Performance is unaffected by 10:1 variations in the
probabilities of short and long cue delays. This implies that viewers use the same transfer
strategies at all cue delays. (b) Information transfer that has occurred at various times t before
and after the cue is measured by using a post-stimulus mask at time t to eliminate visual
persistence.  Nonselective  and  selective  information  transfer  (before  and  after  the  cue)  are  shown 
to  combine  additively.  (c)  Positions  within  rows  differ  substantially  in  their  accuracy  of  report. 

A  simple  model  accounts  for  partial  report  (cued)  performance  at  different  cue  delays  both 
with  and  without  a  mask,  and  for  whole  report  (uncued)  performance.  (1)  The  time  course  of 
iconic  legibility  after  stimulus  termination  depends  on  the  retinal  location  (row).  (2)  Initial 
attention is directed to the middle row; subsequently it switches to the cue-designated row. (3)
The  instantaneous  location-specific  legibility  times  the  instantaneous  state  of  attention,  integrated 
over  time,  determines  cumulative  transfer,  subject  to  the  capacity  limit  of  durable  storage.  A 
review  of  earlier  computational  approaches  shows  that  only  this  model  is  capable  of  giving  a  self- 
consistent  account  of  information  transfer  from  iconic  memory. 
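The three model assumptions can be sketched as a small numerical integration. This is a simplified illustration rather than the published model: legibility is assumed to decay exponentially after display offset, the attention switch is assumed to be instantaneous at the cue, the capacity limit is applied per row rather than to durable storage as a whole, and every parameter value is hypothetical.

```python
import math

def legibility(t, tau):
    """Assumed iconic legibility at time t (s) after offset: exponential decay."""
    return math.exp(-t / tau)

def transfer(row, cued_row, cue_delay, tau_by_row, rate, capacity,
             middle_row=1, t_end=2.0, n_steps=2000):
    """Cumulative transfer for `row`: legibility times attention, integrated.

    Attention is on `middle_row` before the cue and on `cued_row` afterwards.
    The result is clipped at `capacity` (a simplification of the storage limit).
    """
    dt = t_end / n_steps
    total = 0.0
    for i in range(n_steps):
        t = i * dt
        attended = middle_row if t < cue_delay else cued_row
        if attended == row:
            total += rate * legibility(t, tau_by_row[row]) * dt
    return min(total, capacity)

TAUS = [0.3, 0.3, 0.3]  # hypothetical decay constants for top, middle, bottom rows

immediate = transfer(0, 0, 0.0, TAUS, rate=10.0, capacity=4.0)  # top row cued at offset
delayed = transfer(0, 0, 0.5, TAUS, rate=10.0, capacity=4.0)    # top row cued 0.5 s later
middle = transfer(1, 0, 0.5, TAUS, rate=10.0, capacity=4.0)     # middle row, before the cue
```

The sketch reproduces the qualitative pattern described in the text: transfer from the cued row falls as the cue is delayed, because less legible information remains when attention arrives, while the middle row accumulates nonselective transfer during the pre-cue interval.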

ence Sponsored by Boston University’s Wang Institute, Center for Adaptive Systems, Tyngsboro,
MA 01879, May 11, 1991. Two Systems of Visual Processing.

1991 †George Sperling, Neural and Visual Computation Symposium, Center for Neural Sciences, New
York University, NY, May 31, 1991. The Spatial, Temporal, and Featural Mechanisms of
Visual Attention.

1991 †George Sperling, National Academy of Sciences, National Research Council, Committee on
Vision, Conference of Visual Factors in Electronic Image Communications, Woods Hole, MA,
July 23, 1991. Empirical Observations on Image Compression and Comprehension.

1991 †George Sperling, The International Society for Psychophysics, Washington Duke Inn, Duke
University, Durham, North Carolina, New York University, NY October 19, 1991. The
Featural Mechanism of Visual Attention.

1991 †*Chubb, C., Solomon, J. A. and Sperling, G. Invited paper presented by Charles Chubb. Optical
Society of America, San Jose, California, November 7, 1991. Contrast Contrast Determines
Perceived Contrast.

1991 *George Sperling and Stephen Wurst. Paper presented by George Sperling. Psychonomic
Society, San Francisco, California, November 22, 1991. Selective Attention to an Item is Stored
as a Feature of the Item.

1992 *Shui-I Shih and George Sperling. Eastern Psychological Association, Boston, Massachusetts,
April 4, 1992. Cluster Analysis as a Tool to Discover Covert Strategy.

1992 *Werkhoven, P., Sperling, G., and Chubb, C. Association for Research in Vision and Ophthalmology,
Sarasota, Florida, May 6, 1992. The Dimensionality of Motion From Texture.

1992 *Werkhoven, P., Sperling, G., and Chubb, C. Optical Society of America, Albuquerque, New
Mexico, September 25, 1992. Energy Computations in Motion and Texture.

1992 *George Sperling and Hai-Jung Wu. Paper presented by George Sperling. Psychonomic
Society, Saint Louis, Missouri, November 15, 1992. Defining and Teaching Objectively Accurate
Confidence Judgments.

1993 *†George Sperling, Linking Psychophysics, Neurophysiology, and Computational Vision: A
Conference to Celebrate Bela Julesz’ 65th Birthday. Rutgers University, New Brunswick, NJ.
May 1, 1993. Spatial, Temporal, and Featural Mechanisms of Visual Attention.

1993 *Solomon, J. A. and Sperling, G. Talk presented by Joshua A. Solomon. Association for
Research in Vision and Ophthalmology, Sarasota, Florida, May 4, 1993. Fullwave and
Halfwave Rectification in Motion Perception.

1993 *Shih, Shui-I and Sperling, G. Talk presented by Shui-I Shih. Association for Research in
Vision and Ophthalmology, Sarasota, Florida, May 6, 1993. Visual Search, Visual Attention,
and Feature-Based Stimulus Selection.

1993 *Lu, Zhong-Lin and Sperling, G. (1993) Talk presented by Zhong-Lin Lu. Association for
Research in Vision and Ophthalmology, Sarasota, Florida, May 6, 1993. 2nd-Order Illusions:
Mach bands, Craik-O'Brien-Cornsweet.

1993  ♦Chubb,  C.,  Darcy,  J.  and  Sperling,  G.  Talk  presented  by  Charles  Chubb.  Association  for 
Research  in  Vision  and  Ophthalmology,  Sarasota,  Florida,  May  6,  1993.  Metameric  Matches  in 
the  Space  of  Textures  Comprised  of  Small  Squares  with  Jointly  Independent  Intensities. 

1993  ♦fSperling,  George  and  Dosher,  Barbara  A.  Talk  presented  by  George  Sperling.  Linking 
Psychophysics,  Neurophysiology  and  Computational  Vision.  A  Conference  to  Celebrate  Bela 
Julesz’  65th  Binhday.  Rutgers  University,  New  Brunswick,  New  Jersey,  May  1,  1993. 
Structure-from-motion:  Algorithms,  Illusions,  Mechanisms. 

1993  †Sperling,  George.  Geometric  Representation  of  Perceptual  Phenomena.  A  Conference  in
Honor  of  Tarow  Indow.  University  of  California,  Irvine.  July  28,  1993.  The  Representation 
of  Motion  and  Texture. 

1993  †Sperling,  George.  Society  for  Mathematical  Psychology,  Twenty-Sixth  Annual  Meeting,
Norman,  Oklahoma.  Plenary  lecture.  August  17,  1993.  Second-Order  Perception.

1993  *†Sperling,  George.  Ciba  Foundation  Symposium  No.  184.  Higher-Order  Processing  in  the
Visual  System.  The  Ciba  Foundation,  41  Portland  Place,  London,  UK.  October  21,  1993. 
Full-Wave  and  Half-Wave  Mechanisms  in  Motion  and  Texture  Perception. 

1993  †Sperling,  George.  International  Workshop  on  Digital  Video  for  Intelligent  Systems.  Hosted
by  Department  of  Electrical  and  Computer  Engineering,  University  of  California,  Irvine,
California.  December  17,  1993.  An  engineering  model  of  human  visual  processing:  Intelligibility  of
extremely  reduced  images.

George  Sperling:  Invited  Lectures  at  Universities  and  Institutes

1991  Department  of  Psychology  Colloquium.  University  of  California,  Irvine,  Irvine,  CA,  January  10, 
1991.  Visual  Preprocessing. 

1991  Department  of  Psychology,  University  of  California  at  San  Diego,  La  Jolla,  CA,  February  28,
1991.  Mechanisms  of  Attention. 

1991  University  of  California,  Berkeley,  Berkeley,  California,  Joint  Cognitive  Science  Colloquium
and  Oxyopia  Colloquium  (Optometry  School),  March  22,  1991.  Visual  Preprocessing. 

1991  University  of  California,  Berkeley,  Berkeley,  California,  Department  of  Psychology/Cognitive
Science  Colloquium,  March  22,  1991.  The  Spatial,  Temporal,  and  Featural  Mechanisms  of 
Visual  Attention. 

1991  Bonny  Center  for  the  Neurobiology  of  Learning  and  Memory,  University  of  California,  Irvine, 
Irvine,  CA,  April  8,  1991.  Mechanisms  of  Visual  Attention. 

1991  Salk  Institute,  University  of  California  at  San  Diego,  La  Jolla,  CA,  April  10,  1991.  Visual 
Preprocessing. 

1991  Department  of  Psychology,  University  of  Florida  at  Gainesville,  April  26,  1991.  Systems  and
Stages  of  Visual  Processing. 

1991  Shanghai  Institute  of  Technical  Physics,  Shanghai,  China,  June  17,  1991.  How  the  Human
Visual  System  Computes  Visual  Motion.  [Host:  Prof.  Kuang,  Ding  Bo  (Director,  SITP);
Translators:  Dr.  Zhang,  Ming  and  Chen,  Lulin.]

1991  Department  of  Computer  Science,  Shanghai  Information-Technology  Engineers  Examination
Center,  Fudan  University,  Shanghai,  China,  June  18,  1991.  Neural  Principles  of  Preprocessing
for  Human  Pattern  Recognition.  [Host:  Prof.  Wu,  Lide  (Director,  SITEEC).]

1991  Department  of  Electronic  Science  and  Technology,  Institute  of  Applied  Electronics,  East  China
Normal  University,  Shanghai,  China,  June  20,  1991.  Measuring  Attention  and  How  the  Human
Visual  System  Computes  Visual  Motion.  [Host:  Prof.  Weng,  Moying  (Chairman  and  Director);
Translator:  Dr.  Zhang,  Ming.]

1991  Department  of  Psychology,  Beijing  University,  and  Institute  of  Psychology,  Chinese  Academy 
of  Sciences,  Beijing,  China,  June  25,  1991.  [Host:  Prof.  Jing,  Qicheng  (Director,  Institute  of 
Psychology)] 

Morning:  The  Efficiency  of  Perception.  [Translators:  Dr.  Zhang,  Ken  and  Prof.  Jing,
Qicheng.]

Afternoon:  Measuring  Attention.  [Translator:  Luo,  Chun-Rong.]

1991  Computational  Vision  Laboratory,  Institute  of  Biophysics,  Chinese  Academy  of  Sciences,
Beijing,  China,  June  28,  1991.  First-  and  Second-Order  Motion  Perception.  [Host:  Prof.  Wang,
Shuo-Rong  (Director,  Institute  of  Biophysics);  Translator:  Prof.  Wang,  Yun-Jiu  (Laboratory
Director).]

1991  New  York  University,  Cognitive  Sciences  Colloquium,  September  12,  1991.  Is  There
Attentional  Filtering  of  Items  by  Feature  as  Well  as  by  Location?

1992  Center  for  Adaptive  Systems,  Boston  University,  February  25,  1992.  Is  There  Attentional
Selection  of  Items  by  Feature  as  Well  as  by  Location?

1992  University  of  Delaware,  Department  of  Psychology  Colloquium,  March  4,  1992.  Can  Visual
Attention  Filter  Items  by  Feature?

1993  University  of  California,  Irvine,  Department  of  Cognitive  Sciences,  Vision  Lunch  Series,
January  13,  1993.  2nd-Order  Motion  Perception.

1993  University  of  California,  Irvine,  Bren  Fellows  Program,  Learned  Societies  Luncheon,  UCI
University  Club,  March  9,  1993.  Modeling  Mental  Microprocesses. 

1993  University  of  California,  Santa  Barbara,  First  Annual  Gottsdanker  Memorial  Lecture  (Depart¬ 
ment  of  Psychology).  May  27,  1993.  A  Theory  of  Spatial  Attention. 

1993  Kenneth  Craik  Club,  University  of  Cambridge,  Cambridge,  England,  October  25,  1993.  Early
Visual  Processing. 

1993  University  of  California,  Berkeley.  December  3,  1993.  A  Theory  of  Spatial  Attention. 


